Boosting the Permutation Based Index for Proximity Searching
Abstract
Proximity searching consists of retrieving the objects in a database that are similar to a given query. Now that multimedia databases keep growing, this is a fundamental task. The permutation based index (PBI) and its variants are excellent techniques for proximity searching in high dimensional spaces; in low dimensional ones, however, they have been surpassed by other methods. Another drawback of the PBI is that the distance between permutations does not allow elements to be safely discarded when solving similarity queries. We introduce an improvement on the PBI that produces a better promising order using less space than the basic permutation technique, and that also provides information for discarding some elements. To do so, besides the permutations, we quantize distance information by defining distance rings around each permutant, and we store this data as well. The experimental evaluation shows that we can dramatically improve upon specialized techniques in low dimensional spaces. For instance, on the real-world dataset of NASA images, our boosted PBI uses up to 90% fewer distance evaluations than AESA, the state-of-the-art searching algorithm with the best performance in this particular space.
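The idea of combining permutations with distance rings can be illustrated with a minimal Python sketch. The abstract does not say how the rings are defined or how they are used at query time, so everything here is an assumption made for illustration: the class name `RingPermutationIndex`, the single list `ring_edges` of radii shared by all permutants, the Spearman footrule as the permutation distance, and the triangle-inequality test used to discard objects. The sketch stores full permutations and ring numbers, so it does not reproduce the space savings the authors report.

```python
import bisect
import math


def euclidean(a, b):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.dist(a, b)


class RingPermutationIndex:
    """Toy permutation index with quantized distance rings (illustrative only)."""

    def __init__(self, data, permutants, ring_edges, dist=euclidean):
        self.data = data
        self.permutants = permutants
        self.ring_edges = ring_edges  # ascending radii, shared by all permutants (assumption)
        self.dist = dist
        self.ranks = []  # per object: position of each permutant in its permutation
        self.rings = []  # per object: ring number with respect to each permutant
        for obj in data:
            d = [dist(obj, p) for p in permutants]
            self.ranks.append(self._ranks(d))
            # Ring k means ring_edges[k-1] <= distance < ring_edges[k].
            self.rings.append([bisect.bisect_right(ring_edges, x) for x in d])

    @staticmethod
    def _ranks(distances):
        """Inverse permutation: rank of every permutant when sorted by distance."""
        order = sorted(range(len(distances)), key=distances.__getitem__)
        ranks = [0] * len(distances)
        for position, permutant_id in enumerate(order):
            ranks[permutant_id] = position
        return ranks

    def _ring_bounds(self, ring):
        lo = self.ring_edges[ring - 1] if ring > 0 else 0.0
        hi = self.ring_edges[ring] if ring < len(self.ring_edges) else math.inf
        return lo, hi

    def _can_discard(self, i, dq, radius):
        """Safe by the triangle inequality: a match u must satisfy
        |d(q, p) - d(u, p)| <= radius for every permutant p."""
        for j, ring in enumerate(self.rings[i]):
            lo, hi = self._ring_bounds(ring)
            if dq[j] + radius < lo or dq[j] - radius > hi:
                return True
        return False

    def range_query(self, q, radius):
        """Return indices of objects within `radius` of q, most promising first."""
        dq = [self.dist(q, p) for p in self.permutants]
        q_ranks = self._ranks(dq)
        candidates = []
        for i in range(len(self.data)):
            if self._can_discard(i, dq, radius):
                continue  # excluded without computing d(q, data[i])
            footrule = sum(abs(a - b) for a, b in zip(self.ranks[i], q_ranks))
            candidates.append((footrule, i))
        candidates.sort()
        return [i for _, i in candidates if self.dist(q, self.data[i]) <= radius]
```

In this sketch, a call such as `RingPermutationIndex(points, permutants, [0.2, 0.4, 0.6, 0.8]).range_query(q, 0.1)` computes the true distance only for objects whose rings are compatible with the query. A practical permutation index would typically also stop after reviewing a prefix of the sorted candidates, trading exactness for fewer distance evaluations.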
Related resources
Compact and Efficient Permutations for Proximity Searching
Proximity searching consists of retrieving the objects most similar to a given query. This kind of searching is a basic tool in many fields of artificial intelligence, because it can be used as a search engine to solve problems like kNN searching. A common technique to solve proximity queries is to use an index. In this paper, we show a variant of the permutation based index, which, in its orig...
Efficient Group of Permutants for Proximity Searching
Modeling proximity searching problems in a metric space allows one to approach many problems in different areas, e.g. pattern recognition, multimedia search, or clustering. The recently proposed permutation based approach is a novel technique that is unbeatable in practice but difficult to compress. In this article we introduce an improvement on that metric space search data structure. ...
A Brief Index for Proximity Searching
Many pattern recognition tasks can be modeled as proximity searching. From nearest neighbor classification to multimedia databases, the common task is to quickly find all the elements close to a given query. This task can be accomplished very easily by sequentially examining all the elements in the collection, but this turns out to be impractical in two situations: when the distance used to compare elements...
PP-Index: Using Permutation Prefixes for Efficient and Scalable Approximate Similarity Search
We present the Permutation Prefix Index (PP-Index), an index data structure that supports efficient approximate similarity search. The PP-Index belongs to the family of the permutation-based indexes, which are based on representing any indexed object with “its view of the surrounding world”, i.e., a list of the elements of a set of reference objects sorted by their distance order with r...
List of Clustered Permutations for Proximity Searching
The permutation based algorithm has been proved unbeatable in high dimensional spaces, requiring O(|P|) distance evaluations when solving similarity queries (where P is the set of permutants); but needs n evaluations of the permutant distance to compute the order to review the metric dataset, requires O(n|P|) space, and does not take much benefit from low dimensionality. There have been several...